NSF PAR Search | NSF Public Access Repository


Visualizing and Understanding the Internals of Fuzzing

https://doi.org/10.1145/3691620.3695284

Kummita, Sriteja; Zhang, Zenong; Bodden, Eric; Wei, Shiyi (October 2024, ACM)

Full Text Available
Visualization Task Taxonomy to Understand the Fuzzing Internals (Registered Report)

https://doi.org/10.1145/3678722.3685530

Kummita, Sriteja; Miao, Miao; Bodden, Eric; Wei, Shiyi (September 2024, ACM)

Full Text Available
Automated Testing Linguistic Capabilities of NLP Models

https://doi.org/10.1145/3672455

Lee, Jaeseong; Chen, Simin; Mordahl, Austin; Liu, Cong; Yang, Wei; Wei, Shiyi (September 2024, ACM Transactions on Software Engineering and Methodology)

Natural language processing (NLP) has gained widespread adoption in the development of real-world applications. However, the black-box nature of neural networks in NLP applications poses a challenge when evaluating their performance, let alone ensuring it. Recent research has proposed testing techniques to enhance the trustworthiness of NLP-based applications. However, most existing works use a single, aggregated metric (i.e., accuracy), which makes it difficult for users to assess NLP model performance on fine-grained aspects, such as linguistic capabilities (LCs). To address this limitation, we present ALiCT, an automated testing technique for validating NLP applications based on their LCs. ALiCT takes user-specified LCs as inputs and produces a diverse test suite with test oracles for each given LC. We evaluate ALiCT on two widely adopted NLP tasks, sentiment analysis and hate speech detection, in terms of diversity, effectiveness, and consistency. Using Self-BLEU and syntactic diversity metrics, our findings reveal that ALiCT generates test cases that are 190% and 2213% more diverse in semantics and syntax, respectively, compared to those produced by state-of-the-art techniques. In addition, ALiCT is capable of producing a larger number of NLP model failures in 22 out of 25 LCs over the two NLP applications.
Full Text Available
RTL-Spec: RTL Spectrum Analysis for Security Bug Localization

https://doi.org/10.1109/HOST55342.2024.10545408

Miftah, Samit S; Kundu, Shamik; Mordahl, Austin; Wei, Shiyi; Basu, Kanad (May 2024, IEEE)

Full Text Available
DyCL: Dynamic Neural Network Compilation Via Program Rewriting and Graph Optimization

https://doi.org/10.1145/3597926.3598082

Chen, Simin; Wei, Shiyi; Liu, Cong; Yang, Wei (July 2023, ACM)
ECSTATIC: Automatic Configuration-Aware Testing and Debugging of Static Analysis Tools

https://doi.org/10.1145/3597926.3604918

Mordahl, Austin; Soles, Dakota; Miao, Miao; Zhang, Zenong; Wei, Shiyi (July 2023, ISSTA 2023: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis)

Full Text Available
ECSTATIC: An Extensible Framework for Testing and Debugging Configurable Static Analysis

https://doi.org/10.1109/ICSE48619.2023.00056

Mordahl, Austin; Zhang, Zenong; Soles, Dakota; Wei, Shiyi (May 2023, 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE))

Full Text Available
Fuzzing Configurations of Program Options

https://doi.org/10.1145/3580597

Zhang, Zenong; Klees, George; Wang, Eric; Hicks, Michael; Wei, Shiyi (April 2023, ACM Transactions on Software Engineering and Methodology)

While many real-world programs are shipped with configurations to enable/disable functionalities, fuzzers have mostly been applied to test single configurations of these programs. In this work, we first conduct an empirical study to understand how program configurations affect fuzzing performance. We find that limiting a campaign to a single configuration can result in failing to cover a significant amount of code. We also observe that different program configurations contribute differing amounts of code coverage, challenging the idea that each one can be efficiently fuzzed individually. Motivated by these two observations, we propose ConfigFuzz, which can fuzz configurations along with normal inputs. ConfigFuzz transforms the target program to encode its program options within part of the fuzzable input, so existing fuzzers’ mutation operators can be reused to fuzz program configurations. We instantiate ConfigFuzz on six configurable, common fuzzing targets, and integrate their executions in FuzzBench. In our evaluation, ConfigFuzz outperforms two baseline fuzzers in four targets, while the results are mixed in the other targets due to program size and configuration space. We also analyze the options fuzzed by ConfigFuzz and how they affect the performance.
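The idea of encoding program options within part of the fuzzable input can be illustrated with a minimal sketch. This is not the ConfigFuzz implementation: the option table `FLAGS`, the one-byte header size `CONFIG_BYTES`, and the helper `split_input` are all hypothetical names chosen for illustration, and the bitmask encoding is an assumption rather than the paper's actual scheme.

```python
from typing import List, Tuple

# Hypothetical option table for an imaginary target program (assumption,
# not taken from the paper).
FLAGS = ["--verbose", "--strict", "--no-cache", "--fast"]

# Assumption: the first byte of every fuzz input is reserved for options.
CONFIG_BYTES = 1


def split_input(data: bytes) -> Tuple[List[str], bytes]:
    """Decode the reserved header into flags; return (flags, remaining payload)."""
    # Pad with zero bytes so inputs shorter than the header still decode.
    header = data[:CONFIG_BYTES].ljust(CONFIG_BYTES, b"\x00")
    bits = int.from_bytes(header, "little")
    # Bit i of the header switches FLAGS[i] on or off.
    options = [flag for i, flag in enumerate(FLAGS) if bits & (1 << i)]
    return options, data[CONFIG_BYTES:]


if __name__ == "__main__":
    # 0x05 sets bits 0 and 2, so two flags are enabled for this input.
    print(split_input(b"\x05hello"))
```

Because the options live in ordinary input bytes, a fuzzer's usual byte-level mutations would change the configuration and the payload together, without any configuration-specific mutation logic.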
Full Text Available
An empirical assessment of machine learning approaches for triaging reports of static analysis tools

https://doi.org/10.1007/s10664-022-10253-z

Yerramreddy, Sai; Mordahl, Austin; Koc, Ugur; Wei, Shiyi; Foster, Jeffrey S.; Carpuat, Marine; Porter, Adam A. (March 2023, Empirical Software Engineering)

Full Text Available
FIXREVERTER: A Realistic Bug Injection Methodology for Benchmarking Fuzz Testing

Zhang, Zenong; Patterson, Zach; Hicks, Michael; Wei, Shiyi (August 2022, 31st USENIX Security Symposium)

Fuzz testing is an active area of research with proposed improvements published at a rapid pace. Such proposals are assessed empirically: Can they be shown to perform better than the status quo? Such an assessment requires a benchmark of target programs with well-identified, realistic bugs. To ease the construction of such a benchmark, this paper presents FIXREVERTER, a tool that automatically injects realistic bugs in a program. FIXREVERTER takes as input a bugfix pattern which contains both code syntax and semantic conditions. Any code site that matches the specified syntax is undone if the semantic conditions are satisfied, as checked by static analysis, thus (re)introducing a likely bug. This paper focuses on three bugfix patterns, which we call conditional-abort, conditional-execute, and conditional-assign, based on a study of fixes in a corpus of Common Vulnerabilities and Exposures (CVEs). Using FIXREVERTER we have built REVBUGBENCH, which consists of 10 programs into which we have injected nearly 8,000 bugs; the programs are taken from FuzzBench and Binutils, and represent common targets of fuzzing evaluations. We have integrated REVBUGBENCH into the FuzzBench service, and used it to evaluate five fuzzers. Fuzzing performance varies by fuzzer and program, as desired/expected. Overall, 219 unique bugs were reported, 19% of which were detected by just one fuzzer.
Full Text Available
